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Poisson representation techniques provide a powerful method for mapping master equations for birth/death 
processes — found in many fields of physics, chemistry and biology — into more tractable stochastic differen- 
tial equations. However, the usual expansion is not exact in the presence of boundary terms, which commonly 
occur when the differential equations are nonlinear. In this paper, a gauge Poisson technique is introduced that 
eliminates boundary terms, to give an exact representation as a weighted rate equation with stochastic terms. 
These methods provide novel techniques for calculating and understanding the effects of number correlations 
in systems that have a master equation description. As examples, correlations induced by strong mutations in 
genetics, and the astrophysical problem of molecule formation on microscopic grain surfaces are analyzed. Ex- 
act analytic results are obtained that can be compared with numerical simulations, demonstrating that stochastic 
gauge techniques can give exact results where standard Poisson expansions are not able to. 



I. INTRODUCTION 

The calculation and prediction of the behavior of com- 
plex systems is one of the most pressing issues in theoretical 
physics[l] and in many related fields. A common difficulty 
when dealing with statistical problems is that the state-space 
of possible outcomes is enormous. This is particularly so for 
quantum systems — but very similar issues can arise in many 
types of master equation, with applications ranging from ki- 
netic theory to genetics. One of the earliest approaches to 
this problem was the method of Langevin equations in Brow- 
nian motion, which led to the theory of equations with random 
terms, or stochastic equations. An important subsequent de- 
velopment in the field of discrete master equations was the van 
Kampen system-size expansion[2], which leads to an approx- 
imate Fokker-Planck equation equivalent to a stochastic equa- 
tion — whose deterministic part has the usual rate-equation 
behavior. Following this, a more systematic technique was in- 
troduced, called the Poisson[3] expansion. This gives an ex- 
act Fokker-Planck equation in cases with linear rate equations, 
and does not require a system-size expansion. 

The general advantage of Poisson methods is that they em- 
ploy a 'natural' basis in which the distribution is expanded in 
the most entropically likely distribution for linear couplings. 
This allows for very efficient treatment of the underlying Pois- 
sonian statistics. The disadvantage is that Poisson methods 
involve a complex extension to the usual real space of number 
densities. This can result in large errors — both random and 
systematic — when nonlinear interactions generate unstable 
trajectories in the complex space. These are solutions to the 
deterministic drift equations which can reach infinity in a fi- 
nite time. Just as in related quantum phase-space methods [4], 
such trajectories result in power-law distribution tails which 
can cause large sampling errors. These can also give rise to 
systematic boundary term errors [5], since the derivation of the 
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Fokker-Planck equation requires that the resulting distribution 
can be integrated by parts with vanishing boundary terms. 

In this paper, a new method called the gauge Poisson repre- 
sentation is introduced. This removes any unstable trajectories 
or moving singularities, which are conjectured[6] to be the 
cause of boundary term errors — thus allowing an exact map- 
ping of many important master equations into stochastic dif- 
ferential equations, even when the Poisson expansion cannot 
be used. This is an example of a stochastic gauge[7], in which 
stochastic equations are modified by introducing an equiva- 
lence class of gauges that stabilize all complex trajectories. I 
find that correctly chosen stochastic gauges appear to elimi- 
nate the boundary term problem, and simultaneously give rise 
to greatly reduced sampling errors in practical numerical solu- 
tions. Numerical simulations are presented for specific cases 
to verify that gauges with no moving singularities lead to ex- 
act results. 

To illustrate the technique, I show how the gauge method 
can be used to calculate means and correlation predictions 
from master equations. The first example describes genetic 
mutations in a simple model from evolutionary biology [8]. 
The second treats the astrophysical problem of interstellar 
molecular hydrogen production on grain surfaces [9]. The 
examples show relatively simple behavior that leads to sub- 
Poissonian results in cases where there is an analytic the- 
ory available to compare with numerical simulations. This 
demonstrates that correct results can be obtained in the 
stochastic gauge simulations, with a range of stabilizing 
gauges — even when incorrect results are obtained with the 
standard Poisson expansion, due to boundary terms. At the 
same time, these master equations have a great deal of intrin- 
sic scientific interest. 

In many cases, there are more degrees of freedom with cor- 
relations and fluctuations that are closer to Poissonian. A type 
of problem where stochastic gauge methods would be useful is 
the treatment of complex systems where there are large num- 
bers of modes, like a lattice or spatially extended continuum 
model. Correlations in these types of system are also able to 
be treated, in principle, by these methods. Other examples 
are cases where the statistics that are dominated by some rate- 
limiting step involving only small numbers. Details of these 
applications will be given elsewhere. 
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II. BIRTH-DEATH MASTER EQUATIONS 

In master equations, the fundamental object is a probabil- 
ity distribution P(N,f) = [P(?)] N for observing probabilities 
of discrete outcomes, labeled with integers N = (Ni , . . . ,A^). 
These numbers are typically the number of particles or 
atoms (in physics), molecules (in chemistry) or organisms (in 
biology)[l]. The numbers may refer to a large, well-mixed 
volume, or to cells within a larger volume in the case of spa- 
tially extended systems. A common and very significant prob- 
lem is the Markovian time-evolution of the distribution, de- 
fined by an N x N matrix M so that: 



dt 



P(f)=M-P(f) 



(2.1) 



There are severe complexity issues that arise in trying to 
solve this numerically as d, the number of modes or dimen- 
sions, increases (unless the problem is exactly soluble or fac- 
torisable, which is rarely the case in practice). The difficulty 
of solving this equation directly is that even when the maxi- 
mum number is bounded by N max , the total number of discrete 
states N involved is which grows exponentially large 

with the number of distinct modes d. Thus, direct methods 
are not suitable for solving many problems of this type. 

The most general Markovian master equation considered 
here describes a number of coupled reactions, labeled with an 
index a. Each reaction has the following generic structure: 



j 



k a 



j 



(2.2) 



The reactions are restricted to be at most binary, so that 
j < 2, rf,V a j = 0, 1,2, and ff = Zjtf < 2, V fl = E,-^ < 2. 
The probability of a transition per unit time is proportional 
to the rate k" and the number of initial 'particles', giving a 
reaction rate of: 



R a (N)=k a Y\- Njl 



(2.3) 



The traditional rate equation for the process (2.2) , which 
ignores fluctuations, assumes deterministic changes for Nj S> 
1 according to: 



/i=I(^-vjr(N) 



(2.4) 



By contrast, equation (2.1) describes both mean values and 
fluctuations. In this case the master equation for the proba- 
bility of the outcome N — including fluctuations — has the 
form: 



— P(N) = £[fl fl (N")P(N a ) -R a (N)P(N)} (2.5) 

where NJ = Nj + V " — p" is the particle number prior to reac- 
tion (a) that leads to a current number Nj. 



The propagation matrix M can now be constructed from a 
class of matrix ladder operators which either increase (Lt) 
the number of particles in a particular mode j, or decrease 
(LJ) the number of particles and multiply the probability by 
a factor of (Nj + 1 ) . In terms of the probability vector P, this 
means that: 



L|P 



L 7 P 



N 



= P(N l ,...N j -l,...) 

= (Nj + l)P(N h ...Nj + l,...). (2.6) 



Combining products of these together gives the result that: 

l a* /_ \v_-| _ (Nj+v-fj)! 



N (Nj — fj) 



-P(N l ,...N j +v-n,...) 

(2.7) 

This is precisely the matrix operation required to construct 
the master-equation matrix. Hence, after using the identities 
for the raising and lowering operators, one finds that the mas- 
ter equation matrix has a factorized structure given by: 



(2.8) 



A. Poisson Representation 

In this section, the results of the Poisson representation[3] 
will be summarized. This important development employs an 
expansion of the distribution vector P using 'prototype' solu- 
tions, namely the complex Poisson distribution po(oc), without 
requiring a system-size expansion: 



[Po(oc) 



d 



■(ajp/Nj\ 



(2.9) 



The positive Poisson representation^] expands the distribu- 
tion vector P with a quasi-probability, /(a), defined over a 
complex <f -dimensional phase-space of variables a. 

P = J f(a)p Q (a)d 2d a . (2.10) 

Here the discrete variable N, which is a vector of integers, is 
transformed into a continuous variable a — which is a vector 
of complex numbers. In the above form it is conventional to 
choose f (a) as positive, so that it behaves much like a conven- 
tional probability. It is also possible to make other choices. 
For example, the complex Poisson representation employs a 
complex contour integral form which is useful for finding ex- 
act solutions in special cases. In this expansion, 

P = ^/(a) Po (a)/o. (2.11) 

With the aid of differential identities given in the next sub- 
section, either expansion can be used to change the master 
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equation given above into a differential form. Introducing a 
generalised measure d/j(<x) to indicate either a volume or con- 
tour integral, and a differential operator l' p that is determined 
by the propagation matrix M, 



dt 



p(0 



/(o)M-po(o)d/i(o) 



I /(a) [4p (a)] dn{a) . (2.12) 



Next, provided that the relevant boundary terms vanish, par- 
tial integration results in a modified form with a differential 
operator L p acting on the distribution rather than the basis 
itself, where L p conventionally is written with all derivative 
operators on the left: 

|p(0 = j [x P /(o)]po(o)d/i(o) . (2.13) 

This can then be used to deduce that a sufficient condition 
for the distribution function /(a) is that it should satisfy a 
partial-differential equation of Fokker-Planck form: 



(2.14) 



This form is valid in either the positive or the complex Pois- 
son representation. Since a contour integral can be chosen to 
be closed, or to have a direction in phase-space which gives 
rise to an exponentially damped behaviour, it is generally pos- 
sible to choose a contour that gives rise to vanishing bound- 
ary terms for a complex Poisson representation. The issue is 
more difficult in the case of the positive Poisson representa- 
tion, since the extended (complex) phase-space may include 
directions where the Fokker-Planck solutions have power-law 
tails which are not sufficiently bounded. In such cases, the 
presence of boundary terms mean the technique is no longer 
exact. 

In the positive Poisson case, provided the partial differential 
equation is of second order and has positive-definite diffusion, 
an equivalent stochastic differential equation is obtained. The 
crucial point is that this final equation has just linear rather 
than exponential growth in the problem size, as the number of 
modes d increases. 

Observables are calculated using the result that the m-th 
factorial moment is now given by a probabilistic average in 
the positive Poisson representation: 



the actual numbers may be small in at least one of the 
steps. Examples of this are common in problems involving 
nano-structures — like the grains involved in astrophysical 
molecule production [9]. Other potential applications include 
genetic population dynamics [8], where population numbers 
in small regions are also critically important to reproduction, 
and spatially dependent master equations for diffusion or ki- 
netic processes. Direct Monte-Carlo simulations can be used 
in these problems [10], but these can be inefficient and time- 
consuming for large numbers of modes, since they do not 
make any use of the fact that most of the populations involved 
may be nearly Poissonian. 

As explained in the Introduction, the Poisson method for 
birth-death master equations has similar properties to the 
positive-P[4] representation in quantum mechanics. It is ex- 
act if the resulting differential equation is linear, but there are 
problems when there are nonlinear terms in the equations. If 
the Poisson distribution has power-law tails at large radius, 
then the resulting transformation develops systematic bound- 
ary term errors. These typically develop when unstable trajec- 
tories occur[5] in the drift terms of the corresponding stochas- 
tic equations, which can reach the boundary at infinity in a 
finite time. This is an intrinsic problem in the derivation of 
the Fokker-Planck form, Eq (2.14), from the earlier integro- 
differential equation, Eq (2.12). 



B. Fokker-Planck equation 

For the particular master equations considered here, the 
Fokker-Planck equation is easily constructed from the ma- 
trix factorization given in Eq (2.8). The ladder operators obey 
identities as follows, when acting on a Poisson distribution: 



Lj po(oc) = a/po(a) 
L+ P o(a) = (l + d;) Po (a) 



(2.16) 



Since po(ct) is analytic in a, 3 symbolizes either 



: d/dxj 



or — ; 



complex variables a, — ^ 1 
(2.16), together with Eq (2.8), one obtains: 



d/dyj for each of the j =1,... ,d 
x i + iyj . Using the identities of Eq 



. j J 



(2.17) 



(Nj(Nj-l)...(Nj-m)) 



ajf(a)d»(a) 



a" 



(2.15) 



with a similar contour integral result for the complex Poisson 
representation. 

The Poisson expansion is best thought of as providing a 
systematic procedure that can replace approximations valid 
for large enough numbers Nj, including rate-equations and 
system-size expansions[l]. Such large-number approxima- 
tions are inapplicable to many important problems where 



where the Poisson reaction rate R a (OL) corresponds to the de- 
terministic reaction rate when (X = N, and is given by: 



R a (a) = k a Yla/ 



(2.18) 



Partial integration is used next, in order to obtain a differ- 
ential operator Lp acting on the distribution / rather than the 
expansion kernel po(cc). This is most conveniently carried 
out using generalised spherical coordinates, so that there is 
one boundary at large radius r , where boundary terms should 
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vanish. More than one type of generalised radius is possible, 
and it is useful to define 



£e ( -|oc ( -|. 



(2.19) 



where e, is a multiplicity factor, and p > 1 defines the power 
law used to obtain the radial norm. Conventional hyperspher- 
ical coordinates are obtained if p = 2, e, = 1 . 

Since the kernel can grow as fast as e r r N (for 9?(oc) < 0) , a 
sufficient condition to have vanishing boundary terms is that 
the distribution should vanish faster than e~^ r , where X > 1 . 
Since the oscillatory nature of the kernel for 3 (a) ^ can 
cause cancellation of boundary terms, this may not always 
be necessary. From Eq (2.15), a sufficient condition to ob- 
tain a well-defined observable moment of order m, is that the 
distribution should vanish at r — > oo as r-< M+m ) or faster. If 
the distribution vanishes faster than all finite power laws, the 
stochastic equations have a well-defined set of moment equa- 
tions which are identical to the moment equations of the origi- 
nal master equation. Provided these have a unique solution for 
a given initial condition, the less stringent condition that all 
moments exist is presumably sufficient to ensure that bound- 
ary terms vanish at r —* °°. 

After partial integration (provided boundary terms vanish) 
the following Fokker-Planck equation is found: 



dt 



= L P f 



(2.20) 



na-^-Ift 1 - 3 ^ 



R a f(a) 



In cases of interest involving at most binary kinetics, only 
first and second order derivatives occur, giving a differential 
operator in the form: 



L P =Apj + -D±d i dj. 



(2.21) 



Here the repeated Latin indices i,j are summed over i — 
l,...,d, so that At is a <f -component complex vector called 
the drift vector, while Df- is a d x d square complex symmet- 
ric matrix called the diffusion matrix. 

The basic drift and diffusion matrices [3] are given on in- 
spection of the Fokker-Planck equation (2.20), on considering 
all possible values and combinations of /^,v": 
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= E(A<?-v?K(a) 



D tj = 1 £Wj-5 ij )-v a j (v a j -5 ij )]R a (a) 



(2.22) 



C. Stochastic Equations 

To obtain stochastic equations, it is necessary to take a ma- 
trix square root to generate the d x d' noise matrix B, where 



D+ = BB' 



(2.23) 



The lack of uniqueness of matrix square-roots allows arbitrary 
functions in phase-space to be introduced, called diffusion 
gauges [7, 11]. As an example of this, it is always possible 
to choose a diffusion gauge corresponding to separate matri- 
ces for each different reaction a, so that: 



D+ = £VB a B 



aT 



(2.24) 



With this choice, the noises are always proportional to \/R". 
Since each reaction has an individual noise matrix of size d x 
d a , the total noise dimension d' is given by d' ~Y. a d a - If 
required, it is also possible to increase the noise dimension 
to d' = 2d + Y* u d a , by adding d matrix terms B' which have 
nonzero entries only in the j — th row : 



B 



j = 8f(a) 

V2 











... 1 i 



... 



These have the property that B'B /T = 0, so therefore they do 
not alter the diffusion matrix. In addition to these gauges that 
change the noise dimension, it is also possible to use orthogo- 
nal transformations on B which keep the dimension invariant, 
but alter the noise correlations. 

The Fokker-Planck differential operator acting on the dis- 
tribution / is then transformed into a stochastic differential 
equation by taking advantage of the equivalent analytic forms 
in the differential operators, as described in more detail in the 
next section. The result is an Ito stochastic differential equa- 
tion: 

= A+(ce) + Z t V&B%b(t)+fl(aK J (t) ,(2.25) 

a 

where the functions i^(f) are delta-correlated Gaussian real 
noise terms, with: 



Ut% j (t'))=8 ij 8(t- 



(2.26) 



The stochastic functions ^(f) are delta-correlated Gaussian 
complex noise terms, which give rise to a gauge symmetry, in 
that they have no effect on the resulting moments: 



mm')) 
urn*')) 



Sy5('-0 
0. 



(2.27) 



The difficulty with the positive Poisson method[3] outlined 
above is that even normally stable drift equations can become 
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unstable due to movable singularities in this extended com- 
plex phase-space, which become accessible when the noise 
term develops a complex part. The standard term 'movable 
singularity' [12] describes any solution which can reach infin- 
ity in a finite time, depending on the initial conditions. The 
singularity therefore 'moves' with the initial conditions, rather 
than occurring at a fixed time. 

These singular trajectories themselves are rare, and may 
form a set of measure zero on the extended phase-space. How- 
ever, just one singularity has been shown in an earlier study 
of typical examples to lead to Fokker-Planck equations with 
power-law tails, that do not vanish sufficiently quickly at the 
phase-space boundaries [6]. This leads to systematic boundary 
term errors in the results, as well as greatly increased numeri- 
cal integration and sampling errors. 



The important advantage of the Poisson method is that these 
equations have an identical form to simple rate-equations, yet 
they are exact, and include all the relevant statistics. Since the 
Green's function is a delta-function, an initially bounded dis- 
tribution remains bounded, and there are no boundary terms. 



E. Dimerization 

To illustrate the procedure in nonlinear cases, consider a 
dimerization process: 



Xi + Xi -> X2 . 



(2.34) 



The master equation can be represented using the elementary 
matrix operators as: 



D. Unary reactions 

As an illustration, I will consider some elementary types 
of reaction, to demonstrate the derivation given here. These 
results are readily generalized to the stochastic gauge case 
treated later. 

First, consider one-species reactions — this can be a simple 
isomerization or cell diffusion with a rate y, of the form: 



^P = fc[L+-L+L+]Lj-L5"P. (2.35) 

The corresponding differential operator is: 

Lp = ka\ [d 2 -2di-3f] . (2.36) 

As long as partial integration is permissible (which is ques- 
tionable here) the Fokker-Planck equation would be: 



Xi^X 2 



(2.28) 



In this case, the master-equation reaction matrix has the usual 
property that the probability of an event is proportional to the 
rate and the initial number of particles N x . The master equa- 
tion is therefore: 



dt 



P(N) = yNfP(Nf,N 2 



-YMP(N) , 



(2.29) 



where Nj =Nj±l. This can also be represented using matri- 
ces as: 



-P = y[L+-L+] LJ-P. 



(2.30) 



In this case the corresponding differential operator is: 

4 = iaiP2-di] ■ (2.31) 

On transforming into a Fokker-Planck equation, one ob- 
tains: 



-/(a) = y[3 1 -3 2 ]a 1 /(a). 



(2.32) 



Hence, the deterministic differential equation for the charac- 
teristics, which are noise-free in this case, are: 



dt 
da.2 
dt 



= -yoci 
= yoci . 



(2.33) 



dt 



/(a) = k [23i - d 2 - dj] a?/(a) . (2.37) 



Hence, the corresponding stochastic differential equations are: 



day 
dt 

da.2 
dt 



—— = -2kai + iaiV2kC,(t) 



kaj , 



(2.38) 



where (C(f)C(f'))=8(f-f') • 

Only the first equation needs to considered in detail, as it is 
autonomous. In this case, on defining dimensionless variables 
X = 2kt, n = ai, the ungauged Poisson equation reduces to the 
form 



dn 



= —n 



-wr|(x) 



(2.39) 



with (r)(x)r|(x')> =5(x-x'). 

In recent studies of these equations, clear evidence was 
found of substantial numerical errors [13]. To understand this, 
note that there is a movable singularity in this drift equation 
of the form n(x) = l/(x — Xo). In stochastic calculations, it 
is found that random trajectories are generated for negative 
initial conditions (due to the noise term), and these can be ar- 
bitrarily close to the singularity. 

To show the analytic consequences of the singularity[5], 
consider the inverse variable z — 1/n, which has the linear 
Ito stochastic equation: 



dz _ 
dx ~ 



izri(x) . 



(2.40) 
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This equation has a uniform noiseless flow at z = 0, with no 
absorbing submanifold. Hence it has a continuous distribution 
without a zero in the inverse variable distribution fi nv (z)- On 
transforming back to the original variables, the Jacobean of 
the transformation generates a power law tail, with f(n) °< 
l/|n| 4 = 1/r 4 . From the earlier analysis of boundary terms, 
this power-law tail is insufficient to ensure the existence of any 
distribution moments. One must therefore expect systematic 
errors due to boundary terms which do not vanish on partial 
integration. 

F. Generic binary reactions 

As a final illustration, consider a generic binary interaction, 
in which two species are transformed at a rate k into two new 
species: 

X l +X 2 ±X 3 +X 4 . (2.41) 
The master equation is: 

|p(N) = k(N+)(N+)P(N+, N+,N^,N 4 ) 

-kNiN 2 P(N) . (2.42) 

This can also be represented using matrices as: 

^P = /t[L|L|-L+L^]L]"L2P. (2.43) 

Hence, in this case: 

4 = toc 1 a 2 [(l+a 3 )(l+a 4 )-(l+3i)(l+32)] 
= fcaia2[33 + 34-3i -32 + 3334-3i32] • (2.44) 

In the present example, this procedure — which is only 
valid if boundary terms vanish during the partial integration 
— would result in a stochastic equation for the complex Pois- 
son mean variables a,. On introducing the complex 'reaction 
rate', R = ka\a 2 , the resulting Ito equations are: 

^ = -R + iy/Rjl^-ib) 

^ = R + y/R/2&-ib) ■ (2-45) 

This shows very clearly a useful property of the Poisson 
method. The modified particle statistics caused by nonlin- 
ear reactions are immediately apparent from the noise terms, 
since any fluctuations represent a departure from Poisson 
statistics. 

As in the previous example, singular trajectories can occur 
at negative values of a, which can be reached via stochastic 



motion in the complex plane. Singularities like this often ex- 
ist in complex nonlinear equations of polynomial form, since 
these systems are generically non-integrable or even chaotic 
— and the Painleve conjecture[14] states that movable sin- 
gularities are to be expected in analytically continued non- 
integrable sets of equations. Thus, in this case also the bound- 
ary terms may not vanish, leading to systematic errors. Al- 
though such errors are known to be exponentially small when 
there is large linear damping, they can cause problems when 
there is little or no linear damping. 



G. Numerical Simulations 

The results of direct numerical simulations can be used to 
test the accuracy of a stochastic method. The numerical re- 
sults included in this paper are simple examples where the 
detailed results of simulations can be evaluated in exactly sol- 
uble cases. 



1. Stratonovich calculus 

Stochastic calculus is normally carried out in one of two 
different forms. The first is the Ito calculus, where all terms 
that multiply a stochastic noise are evaluated before carrying 
out the stochastic step forward in time. This is the simplest 
form, and corresponds directly to the coefficients in the type 
of Fokker-Planck equation used elsewhere in this section. The 
second is the Stratonovich form, where all the multiplicative 
terms are evaluated implicitly at the midpoint of a given step 
forward in time. This form corresponds to taking the wide- 
band limit of a finite band-width stochastic equation, and fol- 
lows more standard calculus rules for variable-changes. 

For numerical simulations [16] it is generally more efficient 
to use Stratonovich equations [3] — in which the Ito drift term 
is modified in a standard way to allow central difference al- 
gorithms to be employed. In a generic Ito equation like Eq 
(2.25), the Stratonovich method generates a modified drift 
term Aj, where: 

A^Aj-^BafiiBfi. (2.46) 

The resulting equations can be used directly in stable implicit 
central-difference algorithms, which are robust and well- 
suited to the present nonlinear equations. Here the (possible) 
ambiguity in the analytic differential notation is immaterial, 
since by construction the noise matrix Bj^ is analytic or mero- 
morphic. 



2. Error estimation 

Discretization error can be estimated by comparing simula- 
tions with different step-sizes, but identical underlying noise 
sources. This error was typically of order 10~ 3 in the sim- 
ulations in this paper. The algorithm used was an iterative 
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implicit central difference method[16], which directly imple- 
ments the Stratonovich form of the stochastic equation. All 
numerical code was generated in C++ including estimators 
of both the sampling error and discretization error, using an 
XML script and an automatic code generator[17] obtained 
from the XMDS project web-site. 

As an estimator of sampling error, I use the Gaussian es- 
timator of the standard deviation in the mean, a g = <5/y/W s . 
However, more sophisticated estimators must be used when 
the results are strongly non-Gaussian. To ensure that the re- 
sults were a strong test of the stochastic gauge method, a large 
number of samples (N s = 10 6 ) were used in the numerical 
calculations reported here, so as to give low sampling errors 

G g . 

III. GAUGE POISSON REPRESENTATION 

As shown in the previous section, the positive Poisson 
method may have systematic errors in cases involving nonlin- 
ear drift, due to boundary terms on partial integration caused 
by unstable trajectories. The gauge Poisson representation in- 
troduced here treats the problem of boundary terms, by uti- 
lizing a gauge technique similar to that recently proposed for 
the positive-P distribution^]. It adds an extra variable to the 
distribution, which eliminates instabilities by modifying the 
dynamical equations. A type of gauge-invariance allows this 
to be carried out exactly. The gauge equations retain the ad- 
vantages of the Poisson method, but have no boundary term 
errors for suitably chosen gauges. This is essential for correct 
results. 



|.p(f) = J [L G G(aj] v(u)d 2 Q.d 2d a (3.3) 

However, the crucial distinction between this method and 
the usual Poisson method is that the introduction of a weight- 
ing factor in the basis means that there are now additional 
identities available. This allows the differential operator L G 
to be chosen so that the resulting time-evolution of the dis- 
tribution G( a ) remains sufficiently compact at all times to 
guarantee that boundary terms vanish. 

So far this is similar to the positive Poisson 
representation[3]. However, the m— th factorial moment 
is now given by a weighted average, with £2 as a complex 
weighting parameter in the averages: 

(Nj(Nj - 1) ... {Nj - m)) = J (Q.aJ) G(a)d 2 Q.d 2d a 

= (Oaf) = ((aj)} , (3.4) 

where the notation (( )) for a complex Poisson variable 

means a weighted stochastic gauge average. From this one 
obtains the expected result that in a pure Poisson distribution 
with G(a) = 8(£i l)Yl d j =l 8((Xj - <Xj), the mean and vari- 
ance of modes with j > are given by: 

(Nj) = &j 
((Nj-Nj) 2 ) - a, . (3.5) 



A. Gauge phase-space expansion 

The technical details are as follows. Define an extended 
(gauge) phase-space with a = (Q, a), and a weighted Pois- 
son distribution as p(oc) = po(ot)£2. Here £2 = oco is a 
complex-valued weighting factor which weights (or multi- 
plies) the usual normalized Poisson basis vector. The gauge 
expansion is defined for a real, positive distribution G( a ), as: 

P = J G(a )p(a )d 2 Qd 2d a . (3.1) 

I will show that this implies that a freedom of choice becomes 
available in the equivalent stochastic equations. Importantly, 
it then is possible to choose an equivalent stochastic equation 
of motion without instabilities or boundary terms. 

As with the standard Poisson representation, time-evolution 
is treated here by introducing differential identities, so that: 

— P(f) = I G(u)M-v(a)d 2 Q.d 2d a 

= J G(cc ) [L' G p(a )] d 2 Q.d 2d a . (3.2) 

Just as previously, the goal of the transformation is to allow 
partial integration, so that, provided the boundary terms van- 
ish, this equation has an equivalent form of: 



B. Gauge identities 

The extra variable £2 allows an additional differential iden- 
tity to be used to introduce a stochastic gauge — an arbitrary 
vector function in the extended phase-space with d + 1 com- 
plex dimensions. This can be used to stabilize the drift equa- 
tions throughout the extended phase-space, thus allowing in- 
tegration by parts. There is no free lunch here, however! The 
price paid is that there is a new stochastic equation in £2, lead- 
ing to a finite variance in the gauge amplitude £2. While this 
can cause practical problems due to sampling errors — which 
must be minimized — it is important to note that these errors 
can be estimated and controlled by choice of gauge and by in- 
creasing the number of sample trajectories. By contrast, there 
is no presently known technique of estimating and controlling 
boundary term errors in the standard Poisson expansion. 

The additional identity in the weight variable £2 has the sim- 
ple form of: 

p( a) = £23 n p("a ) . (3.6) 

To derive the stochastic gauge equations, I now introduce d' 
arbitrary complex drift gauge functions g = (g,-(oc,f)), to 
give a new differential operator l' g which is equivalent to the 
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usual Poisson operator l' p , but which includes £2 derivatives 
in the extended phase-space: 



L' G : 



j d d' 



[Cfio-1] . (3.7) 



To simplify notation, I have used do to symbolize either 
[3q = 8/3xo] or — i [3q = 3/3yo] for the complex weight vari- 
able £2 = oco = xo + iyo . This allows a choice of analytic 
derivatives, which is later used to obtain a positive definite 
Fokker-Planck equation. Since the added term has a factor 
[£ldo — 1] which vanishes when operating on the gauge basis 
p( a ), the gauge functions can be arbitrary, just as in the anal- 
ogous situation of gauge field symmetries in electrodynamics. 

Summing repeated Latin indices from now on over i — 
0,. . . ,d, Eq (3.7) becomes: 



+ -Djjdjdj 



(3.8) 



Here, the total complex drift vector, including gauge correc- 
tions, is A = (0, A\,. . .Ad), where: 



- A t-LskB jk [j,k>0]. 
k=\ 



(3.9) 



This remarkable result shows that as long as there is a non- 
vanishing noise term, the drift equation can be modified in an 
arbitrary way by adding a gauge term. 

The diffusion matrix changes as well. The total diffusion 
matrix D is a (d + 1 ) x ( d + 1 ) matrix, with a new [d + 1 ) x d' 
square root Bj. 



D 



£2g 
B 



£2gB r 
BB r 



BB 



(3.10) 



Thus, the (d + 1 ) x d' complex stochastic noise matrix B is as 
before, except with one added row: 



B = 



B 



(3.11) 



The additional row means that whenever a gauge term is 
added, a corresponding noise term appears in the equation of 
motion for the gauge amplitude variable £2. The details of this 
are derived next. 



C. Stochastic gauge equations 

So far, there is no restriction on which of the choices of an- 
alytic derivative is utilized to obtain the identities. This means 
that it is possible to use the free choice of equivalent identi- 
ties to give a differential operator which is entirely real and 



has a positive-definite diffusion. This procedure is also fol- 
lowed in the positive-P[4] and positive Poisson[3] representa- 
tions. Here it is extended[7] to include the gauge variable £2 
as well as the other variables. This is achieved by introduc- 
ing a 2(d + 1) dimensional real phase space (xo,yo, . . .Xd,yd), 
with derivatives d M , and separating B~B X + iW into its real 
and imaginary parts. A similar procedure is followed for 
A=A X + iA y . 

The choice for the analytic derivative, where 3, — ► d x or 
3, — > — id?, can now be made definite by choosing it so the 
resulting drift and diffusion terms are always real. In more 
detail, this corresponds to choosing: 



A t d t 
Dijdidj 



AWi+m, (3.12) 
BlB^j + BlB^+ix^y). 



At this point it is necessary to introduce a corresponding real 
drift vector J?^ and diffusion matrix which are defined 
on the 2(d+ 1) dimensional real phase space. Hence, the 
gauge differential operator can now be written explicitly in 
this equivalent real form, as: 



L' r = 



(3.13) 



where = is now positive semi-definite. This can be 
seen by writing <3_ as a 2(d + 1 ) x d' real matrix: 



E 



(3.14) 



so that the diffusion matrix is the square of a real matrix, with: 



By 



m T m r 



(3.15) 



Hence, choosing the analytic derivatives to give real terms 
in Lq generates a positive semi-definite diffusion operator on 
a real space of 2{d + 1) dimensions. Provided that one can 
integrate by parts, the full evolution equation is then: 

| P(0= J [L G G(a)]p(a)d 2 ^a . (3.16) 

Provided that one can integrate by parts, there is at least one 
solution for G which satisfies the positive-definite Fokker- 
Planck equation: 



dt 



G(a,t) 



-d^ + 



G(a,t) . (3.17) 



It is important to note that the crucial partial integration 
step is only permissible if the distribution is strongly enough 
bounded at infinity (| a I — > °°) so that all boundary terms van- 
ish. Just as in the positive-P expansion, this means that the 
distribution must be bounded in phase-space more strongly 
than all power laws in 1 jr as r — > °°, in order for the moments 
to be defined. There is an additional requirement that the dis- 
tribution vanishes faster than 1 /|£2| 2 as |Q| — ► <», since there 
is now an additional integration over d 2 Q. to be carried out. 
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However, the freedom to choose a gauge means that there 
are now ways to eliminate movable singularities from the drift 
equations corresponding to Jl^. I will show in examples given 
later that this removes boundary terms as well — as expected 
from earlier conjectures about the relation between boundary 
terms and drift singularities. 

The positive-definiteness of the diffusion matrix 23 implies 
that the Fokker-Planck equation is equivalent to a set of d + 
1 Ito stochastic differential equations, with d! real Gaussian 
processes £,•(?)• This central result can be written compactly 
using the complex variable form, as: 

— = fta&(0 , 

da ■ 

= Af(a)+B jk Mt)-g k } . (3.18) 

The noises have correlations )£;(?')) =8,-/8(f — ?'), and 
are uncorrected between time steps. Repeated noise indices 
are summed over k = l,d'. 

As with the Poisson representation, if the Stratonovich 
method is used, a modified drift term SL^ is generated where: 

*£ = Jfy-^Bpv 3 pV- (3.19) 

The resulting equations can be used directly in stable implicit 
central-difference algorithms. Care should be used here in 
differentiating the noise matrix r B iN . Since this includes the 
gauge, and is no longer an analytic function, the real and 
imaginary parts need to be treated separately. 



IV. ASYMPTOTICS AND BOUNDARY TERMS 

It is crucial to choose the drift gauge g so that the resulting 
distribution is more strongly bounded than any power-law in 
the radius, in order to remove boundary terms and ensure that 
all of the moments are well-defined. Amongst the gauges that 
achieve this goal, it is preferable to use one that minimises 
the sampling error. An empirical rule is that no determinis- 
tic trajectory can be allowed to reach the boundary in a finite 
time, even on a set of initial conditions with measure zero, as 
this is the signature [6] for a distribution with a power-law tail 
— which cannot be integrated by parts exactly, and has large 
sampling errors. 

However, this is not always a sufficient condition, since 
it does not take into account the radial dependence of the 
stochastic noise. In the generic binary reaction equation 
(2.45), it is cle ar that n oise term has at most linear radial 
growth, since ^2|aiOC2| < |oci| + |ot2| < £i|oti| + £2|oc2| < r, 
where r is defined as in Eq (2.19). More generally, in all cases 
studied here, the noise has radial components with no more 
than linear radial growth in B r j. In some cases, the radial 
noise can vanish. In general, this relatively slow growth in 
radial noise means that moments remain well-defined as long 
as there is no more than linear asymptotic growth in the radial 
drift. 



A. Minimal gauges 

A second criterion of practical significance, is to use a 
gauge that generates an attractive subspace on which the drift 
gauge vanishes. These gauges are called minimal gauges. If 
this condition is not satisfied, then the stochastic noise in the 
gauge amplitude Q. creates a relatively large and growing sam- 
pling error. This is not just an issue of mathematics, but also 
one of computational efficiency. In performing a numerical 
calculation, there are always some numerical errors. These 
are due to the finite nature of computers (and even human cal- 
culators). 

Thus, one has to estimate and minimize numerical errors 
due to round-off error, finite step-size in time, and sampling 
error due to the use of a finite sample of trajectories. In gen- 
eral, there is an optimum gauge which minimizes sampling 
errors, but even a non-optimal gauge can be used simply by 
increasing the number of sampled trajectories. 

B. Existence of gauges 

In this section, I demonstrate the existence of stabilizing 
gauges for systems with deterministically stable rate equa- 
tions. In later sections, the numerical simulation of a real- 
istic nonlinear master equation — which generates boundary 
term errors without a stabilizing gauge — is shown to give 
correct results within the sampling error when suitable gauges 
are used. The important issue, as always, is that the calcu- 
lated result must agree with the correct value within a known 
error-bar. 

The gauge must be chosen to stabilise the nonlinear drift 
equations. Just as in the simpler example of Eq (2.44), the 
drift equations can only have constant, linear and quadratic 
terms. Hence, the unmodified drift equation for the z'-th com- 
ponent can always be written as: 

7>0 j,k>0 

where all coefficients are real. 

Any deterministic instability for standard rate equations — 
in the subspace of real, positive a, — is generally ruled out by 
number conservation laws a priori. In this positive subspace, 
there is often a conservation law such that r = Li e i a i is con- 
served, and hence A r — 0. Here e, must be chosen appropri- 
ately, for example as the number of atoms in a given chemical 
species. The present equations are not always strictly number- 
conserving however, since reservoirs are allowed. Neverthe- 
less, I assume them deterministically stable, in the sense that 
in the positive subspace, the asymptotic Stratonovich drift 
must be bounded so that A r < ar . This does permit expo- 
nential growth, which certainly occurs in many cases. 

A movable singularity can occur in the analytic continu- 
ation of the rate equations, which are the precise equations 
found in the usual positive Poisson equations. It is then essen- 
tial to add a stabilizing gauge that removes any singularities if 
the resulting stochastic equations are to be accurate and useful 
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Initial v 


Final /j 


Rate R a 


Initial Diffusion 


Final Diffusion 








k 








1 





ka 








2 





taiOC2 


R" 








1 


k 








1 


1 


ka 








2 


1 


taiOC2 


R a 








2 


k 





R a 


1 


2 


ka 





Ro 


2 


2 


ka\a 2 


R a 


R a 



TABLE I: Different types of uni-directional reactions, classified by 
initial and final species numbers. 



— although the equations with singularities can still be used 
as approximate equations for large particle number. 

The Painleve conjecture[14] states that movable singular- 
ities are a generic property of the analytic continuations of 
nonlinear sets of equations, so it is to be expected that these 
will generally occur in quadratic equations of the form given 
in Eq (4.1). Each singularity has the signature that for at least 
one component j, it evolves to |(X/| = °° in a finite time to, 
and hence must typically have a leading term with an inverse 
power-law time-dependence with power pj for t < to: 



a; 



(<b-0 



(4.2) 



It is necessary to demonstrate the existence of gauge 
choices that eliminate these singularities. I now wish to 
demonstrate that stabilizing gauges always exist, provided that 
the deterministic rate equations are stable. This proof gives 
only minimal conditions for gauge-stabilization. Other stable 
gauges also exist, and may be more efficient in terms of sam- 
pling error in any given case. 



1. Amplitude gauge 

First, it is important to choose an appropriate diffusion 
gauge — that is, the choice of matrix square root of the dif- 
fusion matrix must be specified. This is most simply done 
by choosing to regard each different unidirectional reaction 
as having a distinct diffusion term proportional to the reaction 
rate, as in Eq (2.24). Unidirectional reactions are classified ac- 
cording to the number of initial species and final species. This 
leads to nine reaction types, as shown in Table (I), ignoring 
factors of order unity for simplicity. 

Diffusion terms occur only in reaction channels involving 
two particles. However, in all channels it is always possible to 
add a diffusion gauge so that the noise matrix is nonvanishing, 
from Eq (2.25). This diffusion gauge choice may not always 
be optimal for sampling purposes, but it is a possible choice, 
and there is a resulting drift gauge which is stable. 



(■(>; 



(4.3) 



Defining a phase angle (j), via: 
a,- = |a,|e 
every drift term has the form 

da t _ + 

= Af ] +Ay\Uj\e^ +Afjl\u j U k \e i ^ + ^ . (4.4) 
It is always possible to subtract the gauge term 

£B i/g/ = AjJ'oyll-^)] 
f 

+A<$a j a k [l (4.5) 

which cancels the original rate and replaces it by one at a 
phase angle (j), equal to the phase of a, . The deterministic part 
of such an equation can only modify the amplitude of a,, and 
so is effectively restricted to a J-dimensional real space, just 
as the usual deterministic rate equations are. But these equa- 
tions have an asymptotic linear radial bound by hypothesis. 
This gauge is therefore a stabilizing gauge. 

An example of this is for r — ka 2 , as in the dimerization 
equation (2.39) which has singular trajectories in the standard 
Poisson method. In this case, one can simply choose: 



gB = A + [\ - \a\/a] = ~ka[a- 

The gauged drift equation becomes: 

da/dt = A + -gB 
= -ka\a\ , 



oc 



(4.6) 



(4.7) 



which is clearly stable because the drift is directed toward the 
origin at all times, so A r < . 

Hence, the gauge corrected equations are stable. When the 
rate-equations have an attractor, this gauge tends to produce 
random circular paths of constant amplitude, instead of local- 
ized behavior in the complex phase-space. I will therefore 
refer to it as the 'amplitude' gauge. 



2. Phase gauge 

The amplitude gauge can be improved by modifying the 
gauge term so that it also stabilizes the phase near (j),- = 0. 
This reduces the size of the gauge-induced noise, and hence 
reduces sampling errors. A suitable choice is to add additional 
gauge terms of form: 



ia iai (^,\a\) . 



(4.8) 



Here the real function «(<(>, |a|) is defined to generate an at- 
tractor at (|) = 0, where «(<(>, |<x|) = 0, so the gauge corrections 
all vanish at zero phase. 
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As before, we can consider the dimerization equation 
(2.39), with rR = —ka 2 . In this situation, one can simply 
choose a(<j)) =ky/a, where a =x+iy, giving an overall gauge 
contribution of: 

gB = -ka[a - |oc|] + ikay = -ka[x - \a\] . (4.9) 

The gauged drift equation then becomes: 

da/dt = -ka(\a\+iy) . (4.10) 

The additional term has no effect on global stability, but in- 
creases the likelihood of trajectories near <|> = 0, since the 
phase gauge above generates a deterministic equation in the 
form dfy/dt « — (j). 

As the most likely trajectory is in-phase and has zero gauge 
correction, this gauge is minimal. Correspondingly, the gauge 
noise and resulting sampling errors are reduced, as I will show 
later in the numerical examples. This gauge will be called the 
'phase' gauge, as it stabilizes the phase-angle of a as well as 
the modulus. Although similar nonlinearities occur in spa- 
tially extended systems [15], more subtle gauge choices may 
be better in these cases. 



V. GENETIC MUTATION MASTER EQUATION 

To demonstrate how the positive Poisson method can be 
usefully employed, consider the important genetic problem of 
a stochastic master equation for the evolution of a finite pop- 
ulation, where the Nj are simply the populations of genotype 
j. A simple model for linear evolution through asexual repro- 
duction and mutation is of the form[8]: 

Xj 

Xi kj i Xi+Xj . (5.1) 

Here fcy is the birth rate, and kj is the death rate. It is some- 
times convenient to also define kf = £y fcy as the total birth 
rate and Qi as the mutation rate, where ku = (1 — Qi)kf . 
Defining [i] — (N\ , Ni ± 1 . . . ,A^), the corresponding mas- 
ter equation is: 

+£ W + l)P(N+[*]) 

+ Ifey(^-l)P(N-[j]). (5.2) 

hi 

This has well-known problems: the state-space may be 
very large, preventing a direct matrix solution. On the other 
hand, while the corresponding average rate-equations reduce 
to the widely-studied Eigen[8] quasi-species model, the rate- 
equations are unable to treat population fluctuations in small 
samples. 



An interesting exactly soluble case involves two species, 
with Q = 1, so that reproduction always leads to mutation. 
This has rates given by: 



ki 


= k 2 


= k 


hi 


= h\ 


= k m 


k n 


= k 2 2 


= 0. 



A. Stochastic equations 

The equivalent Ito stochastic equation is exact for all master 
equations of the form of Eq (5.2). It is: 

da ' 

-jf = -kjO-j + X>ya, + Y,Bj&(t ) . (5.4) 

Here, the noise matrix Bjt is determined from the sym- 
metrized diffusion matrix, Dy = [oc,'£y + Ctjkji], where: 

£8ftB;t = Ai. (5.5) 

k 

This leads immediately an important result: an initially 
Poissonian distribution is invariant under pure decay pro- 
cesses, but can develop increased fluctuations with non- 
Poissonian features due to birth and mutation events, de- 
scribed by the matrix Dy. In general, constructing the square 
root of a symmetric real matrix Dy is non-unique. The most 
powerful technique requires a matrix diagonalization through 
an orthogonal transformation, and may result in eigenvalues 
of either sign. If all the eigenvalues are positive, the result- 
ing fluctuations are super-Poissonian. If some are negative, 
at least one sub-Poissonian feature will occur, and a complex 
stochastic process will result. 

A simple example is obtained by considering the symmet- 
ric two-species case of Eq (5.3). In this case, the diffusion 
is entirely off-diagonal, and the Poisson representation ex- 
actly transforms a complicated master equation into a soluble 
stochastic equation. The Ito equations can be simplified fur- 
ther as follows, on introducing population sum and difference 
variables n± = (a,\ ±a 2 ) and k± = (k^fk m ) : 



= -k+n+ + \j2k m n^t,\(f) 

^— = -k-n- + i^2k m n + t, 2 {t) ■ (5.6) 
dt 

In this situation, there is an absorber at n + — 0, since any 
stochastic trajectory that reaches this value has a zero deriva- 
tive. This is due to the randomness of birth or death events, 
which mean that it is always possible for the random walk in 
this low-dimensional population space to finish at extinction. 
In addition, any initial differences between the two popula- 
tions decays, and is replaced by a strong sub-Poissonian cor- 
relation. This is due to the fact that all births must occur in a 
way that tends to equalize the two species that are present. 
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B. Means and correlations 

To calculate analytic solutions, standard Ito calculus can 
be used to obtain the exact time-evolution of correlations and 
expectation values. This is straightforward, since in Ito calcu- 
lus the noise terms are not correlated with the other stochas- 
tic variables at the same time, so (/((X,- )£.-(*)) = 0. Thus, 
the means and variances are all soluble from their respective 
time-evolution equations — which is also possible using the 
master-equation form. 

Equations for general correlations aTaT = (oC/OC;)? can 
be either calculated from the master equation, or from the 
stochastic equations. Defining Afcy = fcy — 8yfc; and using the 
rules of Ito calculus, one obtains: 

5 = X>yix7 

d(X[(Xj y ^ „ 

— = l^Akijaiai + l^Akuaiaj + Dij 



(5.7) 



In the symmetric two-species case, the initial means and cor- 
relations are defined as: iT+ — (n + (0))p, TO = (n_(0))/>, 

~r£ r = (n 2 + (0)) P ,n?L = (n 2 _(0)) P and JT^rT = (n+(0)n-(0)) P . 
Solving the moment equations (5.7) gives the following exact 
results: 



(n+{t))p 
(n-(t)) P 

(n 2 + (t)) P 

(n 2 -(t))p 
(n+{t)n-(t)) P 



n + e 



-k+t 



-k-t 



n 2 +e -2k +l +2kme -lk + t/2 w 

2k m 7T + ~ 



_sinh(fc+f/2) 



k + /2 



nle- 2 "- 1 



k + 3k„ 



-k+t 



-2k 



n+ri-e 



-2kt 



') 

(5.8) 



Suppose that birth and death rates are equal (k m — k), so the 
mean population (n+(t))p is time-invariant. Then the asymp- 
totic population differences have a mean and variance of: 



lim(AL(f)) = 
lim([jV_(f)] 2 ) = lim (n 2 _(t)+n+(t))p 



n+ 
2 



(5.9) 



This indicates that the population difference has a steady- 
state variance of half its usual value, owing to the fact that 
all births are correlated between the species, thus causing 
sub-Poissonian statistics — while all deaths are uncorrected, 
tending to restore the Poissonian distribution. By compari- 
son, the variance in the total population shows linear growth 
in this case, as some populations in the total ensemble become 
extinct, while others can randomly grow to a large population. 

In this case there are no unstable trajectories, and the posi- 
tive Poisson method can be used directly. However, it should 
be noted that this model ignores inter-species competition 




FIG. 1: Sampled mean populations (n+)p for genetic mutation ex- 
ample in the Poisson representation, parameters as in text, showing 
upper and lower one standard deviation error bounds. Sampling er- 
ror (c m ) is of order 10~ 3 or less. These results agree with the analytic 
theory (dotted line) within the sampling error. 



— which could lead to nonlinear effects involving boundary 
terms. 



C. Numerical results 

Applying the Stratonovich rules to generate equations for 
numerical stochastic integration results in: 



= -km/2 - k+n+ + v /2fc m n+^i (f ) 

at 
dn 



— = -k-n-+i^2k m n + t, 2 (t) • (5.10) 

This illustrates the typical feature of the Stratonovich cal- 
culus, which is the generation of terms in the drift equations 
due to the noise. Some care is needed in calculations near the 
absorbing boundary at n + = 0, which is treated by imposing 
an appropriate boundary condition at this point. 

Fig (1) shows the mean values obtained from numerical 
simulation of these equations for an ensemble of 10 6 trajecto- 
ries, showing that the exact analytic result is compatible with 
the upper and lower one standard deviation error bounds from 
the simulations. Results for cross-correlations are shown in 
Fig (2), also agreeing extremely well with the analytic predic- 
tions. The numerical and analytic results are indistinguishable 
at this graphic resolution. 

The stochastic equations were integrated for a total time of 
t = 5, using values of k = k m = 1. The minimum step-sizes 
used were At = 0.01 and Af = 0.005 (to enable a check on the 
errors due to finite time-steps, which were negligible). Initial 
values were set to n + = 5, n_ = 2, to give results in low popu- 
lation regions with large departures from Poissonian behavior. 
All simulation results agree well within the sampling error at 
t = 5, as shown in Table (II). 

In summary, all the simulation results are in excellent 
agreement with analytic predictions for this model. No bound- 
ary term errors are found using these linear equations, as one 
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actions are: 



H (IN) R H 
ki 



2H 



H 2 



2H - H * 2 



H 



H* 



H 2 yj > Hi 



(6.1) 



FIG. 2: Sampled moments of (n?_ )p , for genetic mutation example in 
the Poisson representation, parameters as in text. Sampling error is of 
order 10~ 3 or less. Negative values indicate sub-Poissonian statistics 
due to mutations that cause strong inter-species correlations. To this 
graphical resolution, the exact and numerically simulated results are 
indistinguishable. 



Moment 


Analytic 


Poisson 


{n+)p 


5.0 


5.002(7) 


{n-) P 


0.908 x 10- 4 


0.908(0) x 10- 4 


(n\) P 


75 


75.1(2) 


{nL) P 


-2.5 


-2.502(7) 


(n+n-} P 


0.4540 x 10~ 3 


0.4541(6) x 10~ 3 



TABLE II: Table of observed moments, comparing analytic and sim- 
ulated results for the genetic mutation equations in the Poisson ex- 



This describes hydrogen atoms H adsorbed onto a grain sur- 
face, and forming hydrogen molecules H 2 on the grain. The 
number of adsorbed atoms can grow via a generation rate (R) 
from an input flux //( w ) , until it reaches an equilibrium. This 
occurs due to losses from molecule formation with a total rate 
of k = k\ + k 2 , and from desorption (y), which stabilizes the 
concentration of H through emission of unbound hydrogen 
atoms H*. The concentration of H 2 is also stabilized by des- 
orption (y 2 ), leading to unbound hydrogen molecules H 2 . 

There are additional effects due to flux-blocking caused by 
adsorbed molecules and atoms, as well as dissociation pro- 
cesses — which are neglected here for simplicity. Interstellar 
grains have a distribution of sizes and compositions, which 
means that master equations like these need to be solved for 
a variety of parameter values to give the total molecular pro- 
duction rate. 

Of course, competing processes involving other atomic and 
molecular species can also occur, leading to an overall situa- 
tion of great complexity if all possible molecular species were 
included. Here I will focus on the elementary case of hydro- 
gen molecule production. Poisson variables 0Ci , OC2 , 0C3 , 0C4 can 



pansion at / = 5. Standard deviations a g for the last significant be introduced representing [H], [H 2 ], [H*], [#|] respectively. 

In the positive Poisson representation, this leads to the fol- 
lowing system of equations: 



digit are in brackets. These results for mean populations, variances 
and correlations demonstrate that in this case, the Poisson stochastic 
equations agree with the known analytic results within the sampling 
error of the simulations. 



might expect, since there is no possibility of movable singu- 
larities with linear drift equations. 



dUi 
dt 

da 2 
dt 

da?, 

dt 
da,4 
dt 



[R-yai-2ka\] +iaiV2kt^(t) 
feia?-y 2 a 2 



-— = k 2 ui + y 2 u 2 . 



(6.2) 



VI. ASTROPHYSICAL MOLECULAR HYDROGEN 
PRODUCTION 



It should be noted that the first equation can be solved in- 
dependently from the other ones — one can also show this at 
the level of the Fokker-Planck equation, by simply integrat- 
ing out all the other variables. The astrophysical molecular 
production rate of interest is: 



Stochastic gauges are only needed when the equations are 
nonlinear, which comes about when multi-component compe- 
tition or formation processes are present. To give a typical ex- 
ample of this, consider the astrophysically important problem 
of hydrogen recombination to form molecules on interstellar 
grain surfaces [9]. This is thought to be the major source of in- 
terstellar H 2 , and it is known that conventional rate equations 
are unable to describe this accurately, due to low occupation 
numbers at the critical step of dimer formation. The main re- 



R Hi = (k 2 aj + y 2 a 2 ) . (6.3) 

In this model, all hydrogen molecules created are even- 
tually desorbed, since in the steady-state (k\a\) = (y 2 OL 2 ). 
Hence, the total molecular production rate in the steady-state 
is obtainable from the solution to the first equation: 

% - fc(oci) 

= *<tfi(M-l)) • (6.4) 
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The important point of physics here is that the hydro- 
gen molecule production rate is proportional to the auto- 
correlation function of the hydrogen atom density — and 
hence requires a knowledge of the correlations and fluctua- 
tions present. This of course, has a simple physical origin, 
since hydrogen molecules can only form if at least two atoms 
are present simultaneously. 



Thus for large grains with e — > °° the high flux limit is just 
Rff* =R/2, which is also the rate-equation limit. At low fluxes 
(i.e., small grains) a dramatic and physically understandable 
feature is obtained: the //| production rate can be suppressed 
below the rate equation result. In this limit of e — > , 



(«) = 



R 



A. Analytic solutions 

For notational simplicity, I define n — (X\, which is the Pois- 
son variable that correspond to the hydrogen atom number. 
From the Poisson expansion viewpoint, the only non-trivial 
term is the hydrogen equation, as this introduces noise. All 
the other equations can be solved once the hydrogen number 
fluctuations are known. 

It is useful to obtain the steady-state hydrogen fluctuations 
from the complex Poisson representation defined in Eq (2.1 1), 
as this has an analytic solution for the steady state. The re- 
duced Fokker-Planck equation for the hydrogen atom vari- 
ables is simply: 



— (-R + jn + 2kn z ) -k^-^n 1 
On 0n L 



f(n,t) . (6.5) 



This has a steady-state which is exactly soluble, though de- 
fined on a complex contour starting and ending at the origin: 

/(n,oo) =Cn^ k - 2 ^exp(2n + ^j . (6.6) 

Here I have kept the derivatives in analytic form, to obtain 
the simplest potential solution. However, the result is instruc- 
tive, since it is clear that this analytic form is inherently com- 
plex. This is the essential reason why a gauge variable £2 is 
useful in order to get simulations that behave like this simple, 
compact solution. The gauge variable can attain complex val- 
ues during a stochastic calculation, even when the distribution 
itself is constrained to have positive values. 

In the case of complex valued solutions as in Eq (6.6) it is 
necessary to choose an appropriate integration contour to de- 
fine the manifold over which the analytic derivatives are de- 
fined. For simplicity, I introduce relative flux and relaxation 
parameters, £=R/ (2k) and p = J /2k. Next, using a Sommer- 
feld contour-integral identity in the inverse variable z= 1 jn , 
one obtains the result for the moments that: 



(n m ) =C 



r(o+) 

./ — CO 



= e ffl/2 /2 P +, n -i(4Vi)//2p-i(4Vi). (6.7) 

This exact solution gives the steady-state production rate 
(neglecting dissociation): R H * — fc(« 2 ) . 

An obvious result, coming from the asymptotic properties 
of Bessel functions, is that 



lim (n m ) 



p m/2 



R" 



lim(n M ) = - 

e^o x ' (y+k[m — 1J) x ... x (y) 



(6.8) 



<» 2 > = 



R l 



y{k + y) ■ 



(6.9) 



For » y, this predicts enormously reduced hydrogen 
molecule production rates compared to normal rate equations. 
The reason for this is simply that when there is only one atom 
at a time on the grain, no molecules are produced. Similar 
results have been found in earlier Monte Carlo calculations as 
well[9]. 



B. Poisson equations 

It is simplest to use a scaled time x = 2kt to calculate the 
stochastic equations in the Poisson representation for n = &\ . 
With this variable the drift and noise matrices are both scalars; 
the resulting Ito equations of motion are unstable in the ab- 
sence of gauge terms: 



dn 
~a\ 



= [e - pn - n 2 ] + inr\ (x) 



(6.10) 



where (r|(x)r|(x')) = 8(x-x'). 

There is a singular trajectory n — ► — °° which can be ac- 
cessed via the complex diffusion of n into the negative half- 
space of n < ; this is the instability already encountered in 
the solutions to the dimerization equation (2.39). For numeri- 
cal purposes, it is advantageous to use the Stratonovich form, 
suitable for central difference algorithms: 



dn 
d% 



= e-n[p- l/2 + n] + ;'nr|(x) 



(6.11) 



Results of the numerical simulations of the Stratonovich 
equations for hydrogen molecule formation problem in the 
standard Poisson representation, showing upper and lower 
one-standard deviation error-curves, are given in Fig (3) and 
(4). The results clearly show the problems caused by the dy- 
namical instabilities in these equations, which cause both a 
large sampling error, especially in (n 2 ), as well as systematic 
errors. This is especially noticeable in (n), which has a rela- 
tively low sampling error, and is systematically incorrect. The 
steady-state value for these parameters is (n) — 0.407.., which 
disagrees with the simulations by a margin much larger than 
the measured sampling error. The large sampling error in (n 2 ) 
is exactly what is expected from the inverse power law dis- 
tribution tails, which mean that the standard deviation in this 
moment is undefined. 

These equations are difficult to integrate numerically, ow- 
ing to the instabilities, and it is essential to integrate by al- 
ternating between n (for \n\ < 1 , and z — 1/n (for \n\ > 1) 
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FIG. 3: Sampled moments of (n) for astrophysical hydrogen 
molecule production in the Poisson representation, parameters as in 
text. Adjacent lines give upper and lower (±O g ) error bounds caused 
by sampling error. 



dn 



[e-pn-n 2 ] + in[l\(x) - g] . (6.12) 



For example, consider the effects of three different gauges 
which all stabilize the equations. The first two correspond to 
the amplitude [a] and phase [p] gauges treated in the previous 
section, described by Eq (4.6) and Eq (4.9) respectively. The 
third one is another stabilizing gauge which only acts in the 
left half-space of 9t(n) < 0, where the instabilities are located 
in this example. This is called the 'step' [s] gauge. 

Defining n = x + iy, the three stabilizing gauges considered 
are: 




FIG. 4: Sampled moments of (n 2 ) for astrophysical hydrogen 
molecule production in the Poisson representation, parameters as in 
text. Adjacent lines give upper and lower (±Og) error bounds caused 
by sampling error. 



in order to obtain stable numerical results. Numerical re- 
sults throughout this section were obtained using parameters 
of e = 0.1, p = 0.1 for which the analytic results can be easily 
calculated from the Bessel function representation. In this re- 
gion the rate equations break down, and occupation numbers 
are very small, which is a testing region of parameter space for 
these expansions — since the fluctuations are far from Pois- 
sonian. The integrations were for a total time of x = t = 40 
to allow an approximate numerical steady-state to be reached 
from an initial value of n = (for simplicity, a value of k = 1/2 
was taken). The minimum step-sizes used were At — 0.005 
and 0.0025. 



C. Stochastic gauges 

Fortunately, it is simple to stabilize these equations by 
adding non-analytic corrections to the drift. From the basic 
stochastic gauge equations (3.18), with a scalar gauge g, the 
resulting Ito equations for astrophysical hydrogen production, 
are: 



g a = i{n-\n\) [a] 
g p = i{x-\n\) \p] 
g s = 2ixQ(-x) [s] . 



(6.13) 



Noting that here B — in, these give rise to the following 
three Ito equations in phase space, each of which is manifestly 
stable at large \n\ : 

dn 

— = e-n[p+H] + wr|(x) [a] 
ax 

dn 

— = £-n[p + \n\+iy] + irtr\(%) \p] 



dn 
dx 



= e - n [p + \x\ + iy] + iwr\ (x) [s] . (6.14) 



For numerical integration, it is more efficient to transform 
to the Stratonovich form, and of course the gauge weight 
equations are necessary for weighting purposes. In the am- 
plitude gauge, the Stratonovich equations are: 



dD. 

~dx 

dn 

dx 



= £l[g a r\(T) + (n-g 2 a )/2\ 

= e-n[p-l/2 + |n|] + ;«r|(x) . (6.15) 



In the phase gauge, the Stratonovich equations are: 



nM(x) + (iy-$/2] 



dQ 

~dz 

dn 

— = e-«[p-l/2+H+;>] + ;«r|(x) . (6.16) 
ax 



In the step gauge, the equations are : 

= £-n[p-l/2 + |x|+/y] + ;nr|(x) . (6.17) 



da 

d% 
dn 

dx 



Clearly the first two equations only have inward drift vec- 
tors with d\n\/dx < at large enough \n\. The last, the step 
gauge, is similar, except that it shows linear growth if p < 1/2 
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FIG. 5: Sampled moments of ((«)) (upper plot) and {(n 2 )) (lower 
plot) for astrophysical hydrogen molecule production in the 'phase' 
gauge, parameters as in text. Adjacent lines give upper and lower 
(±O g ) error bounds caused by sampling error. 
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FIG. 6: Sampling errors: standard deviation G g in the mean of {{n 2 )) 
for astrophysical hydrogen in the phase gauge (lower curve) , and 
amplitude gauge (upper curve) . 



Moment 


Analytic 


Poisson 


Phase 


Amplitude 


Step 


<n> 


1.0 


1.0 


1.003(4) 


0.993(10) 


1.005(6) 


((»» 


0.407 . . . 


0.456(4) 


0.409(2) 


0.399(5) 


0.406(4) 


«" 2 » 


0.059... 


0.077(5) 


0.061(1) 


0.058(2) 


0.064(3) 



TABLE III: Table comparing analytic and simulated results for three 
different stochastic gauges and the Poisson expansion; the moment 
((n 2 )) is critical for molecule production. Sampling error (<j g ) in 
brackets. 



and x = 0. At worst this can only lead to a singularity in in- 
finite time, and in any event the y-axis is not an attractor: so 
growth along the y axis only leads to a temporary increase in 
radius, not a singularity. Hence, all three gauges are com- 
pletely stable, with no movable singularities. 



D. Numerical simulations 

As results in all three gauges were similar, apart from 
changes to the sampling error, I will only show graphs of the 
detailed results in the phase-stabilized gauge, using the same 
parameter values as previously. 

Results of the numerical simulations in the phase-stabilized 
gauge, showing upper and lower one-standard deviation error- 
curves, are given in Fig (5). It is clearly dramatically improved 
compared to the Poisson results. 

Fig (6) shows that there are reductions of up to four orders 
of magnitude in the sampling error of molecule production 
rates, relative to the Poisson method. 



E. Comparison of moments and sampling errors 

Apart from the unmodified Poisson or 'zero gauge' results, 
the gauge simulations are stable. Nevertheless, on closer in- 
spection, the stable gauges don't behave in an identical way 
as regards the sampling error with a finite set of trajectories. 



This can be seen from the previous figure, which compares 
two stable gauges. 

For each gauge and for the Poisson expansion, the observed 
moment and its sampling error a g (standard deviation in the 
mean) is given in Table (III), which tabulates the final near- 
equilibrium simulation results at X = 40, and compares them 
to the equilibrium analytic result for x = °°. For the stable 
gauges, the results are within a g of the analytic calculations 
in most cases, and are within 2a g in the remaining more ac- 
curate cases — where the residual discrepancy was partly due 
to the finite time-step discretization error of around ±10~ 3 . 
This indicates that all these (stable) gauges converge to the 
analytically known correct answer. 

The corresponding (unstable) Poisson method clearly gives 
incorrect answers due to boundary term and/or sampling er- 
rors, with up to I2a g discrepancy in the case of the mean 
number of hydrogen atoms, («) . The graphical and tabular 
evidence indicates that the mean atom number is incorrect be- 
cause the unstable trajectories cause power-law tails in the 
distribution, and consequent boundary term errors. In addi- 
tion, the graph shows that the Poisson time-history has large 
fluctuations with sampling errors of up to 1000%, showing no 
signs of equilibration for the molecule production rate, which 
is proportional to (n 2 ). This is further evidence for power-law 
tails, which are also found in a similar quantum-optical master 
equation. 

The amplitude gauge has no systematic errors, but gives the 
worst sampling error of the stable gauges, as the n variable 
is the least constrained in this gauge, tending to diffuse in a 
circle. For these parameters the phase gauge gives the best 
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results, as it localizes the n variable near a deterministic stable 
point. The last gauge is a step gauge — only giving non-zero 
corrections when x < 0. This has the feature that the gauge 
term Q. only changes when the trajectory reaches x < 0, and 
gives sampling errors intermediate between the others. 

One might expect that the step gauge would give lower sam- 
pling errors in (Q), since this gauge is zero in the right half- 
plane. Instead, the phase-stabilized gauge gives the lowest 
overall sampling errors for all quantities with these parameter 
values, even for the gauge amplitude (Q.) . This is an example 
of 'prevention is better than cure' . That is, phase-stabilization 
is also able to prevent amplitudes from growing along the ±y 
axis. The step gauge corrects this growth too late for optimal 
results, having to use a numerically bigger gauge correction 
— with larger sampling errors. 

VII. CONCLUSION 

The gauge Poisson method is shown to generate a stochastic 
differential equation that is exactly equivalent to a nonlinear 
master equation in certain cases. By comparison, the system- 
size expansion is only approximate, and the positive Poisson 
representation is not exact for problems which have bound- 
ary terms due to movable singularities. The gauge technique 
provides a way to eliminate boundary-term errors due to sin- 
gular trajectories. The price paid for this advantage is an extra 
stochastic gauge amplitude, which generates a sampling er- 
ror that grows in time. The focus of numerical simulations 
in this paper is on cases where the existence of exact analytic 
results allows the issue of random and systematic errors to be 
carefully investigated. While this is not a complete proof that 
boundary terms can be eliminated in all cases, it suggests that 
choosing a stabilizing gauge is a necessary condition. 



This type of model is very general. For example, one can 
easily include linear diffusion and extend the theory to treat 
fluctuations in reaction-diffusion models or even Boltzmann 
kinetics [3]. The method simply requires that the populations 
are defined as occurring in a lattice of bounded cells, either in 
ordinary space or in a classical phase-space, together with the 
appropriate hopping rates. The resulting deterministic equa- 
tions are just the same as would occur in discretized non- 
stochastic equations. However it is necessary to include cell- 
volume factors in the nonlinear rate constants, so that the noise 
terms vary with the j— th lattice cell volume V 7 - typically re- 
sulting in stochastic noise terms proportional to 1/ y/Vj. Ap- 
plications to these problems will be treated elsewhere. 

The technique can be easily extended to include a large 
number of coupled kinetic processes, as occurs both in genet- 
ics, and in the generation of chemical species of astrophysical 
importance: for example, OH, H2O , CO and so on. By con- 
trast, the direct solution of the master equation grows expo- 
nentially more complex as the number of interacting species 
increases. Similar considerations arise when treating biologi- 
cal species with typically very large numbers of genotypes, or 
when treating extended spatial (multi-mode) problems. The 
method of choosing stable gauges developed here may also be 
useful for the corresponding quantum problems. 
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